11 research outputs found

    Unsupervised clustering method for natural language processing using asymmetric similarity measures and paradigmatic properties

    Grouping and classification are among the most common tasks for human beings, yet they are highly complex. Humans, however, are weak at processing large amounts of data quickly, which is precisely what computers do well. Large amounts of data, of different types and with different purposes, are generated on the Internet every day. Handling them requires clustering algorithms that identify the different groups and their characteristics automatically, without prior knowledge. It is also important to define clearly which similarity measure will be used in the clustering process; the vast majority of such measures are symmetric. This thesis proposes a novel asymmetric similarity measure, the Unilateral Jaccard Similarity Coefficient (uJaccard), under which the similarity between two objects is not the same in both directions: uJaccard(a,b) ≠ uJaccard(b,a). It also presents a weighted asymmetric similarity, the Unilateral Weighted Jaccard Similarity Coefficient, which measures the level of uncertainty between two objects. The thesis further proposes a new graph property, the paradigmatic property, which takes regular equivalence as its fundamental characteristic, and finally a clustering algorithm, PaC (Paradigmatic Clustering), which incorporates uJaccard and the paradigmatic property. Extensive evaluations were carried out on small, real, and synthetic data, and 3 large corpora were processed. The results show that PaC outperforms state-of-the-art clustering algorithms. Moreover, PaC can be executed in a parallel, distributed, incremental, and streaming fashion, characteristics needed to process large amounts of continuously generated data.
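
The abstract states only that uJaccard is asymmetric, uJaccard(a,b) ≠ uJaccard(b,a); it does not give the formula. The sketch below is therefore not the thesis's definition. It contrasts the standard (symmetric) Jaccard coefficient with a hypothetical one-sided variant, `one_sided_overlap`, which is an assumption made here for illustration: it divides the shared elements by the size of the first set only, so swapping the arguments can change the result.

```python
def jaccard(a: set, b: set) -> float:
    """Standard Jaccard coefficient: |a ∩ b| / |a ∪ b| (symmetric)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def one_sided_overlap(a: set, b: set) -> float:
    """Hypothetical asymmetric variant (an assumption, not the thesis's uJaccard):
    shared elements divided by the size of the first set only."""
    return len(a & b) / len(a) if a else 0.0


# Illustrative neighbour sets for two nodes; the values are made up.
a = {1, 2, 3, 4}
b = {1, 2, 5}

print(jaccard(a, b), jaccard(b, a))                      # same value both ways
print(one_sided_overlap(a, b), one_sided_overlap(b, a))  # differs by direction
```

With these sets the symmetric score is identical in both directions, while the one-sided score depends on which object is taken as the reference.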

    Psychometric computational thinking test

    The recent widespread popularity of computational thinking (CT) has raised the need for a reliable method of assessing it. Recent CT tests focus on programming skills rather than on the analytical ability and problem-solving processes used in science, philosophy and other areas of knowledge. This poster presents the results (test design) of an ongoing project that has developed a Psychometric Computational Thinking Test (PCTT) with three phases: test design, test implementation and test application. With regard to the PCTT design, the reliability and validity of the test were based on content and construct validity, which also covers the rating scales used in its application. This work makes two contributions: (1) a standardized CT test design incorporating psychometric as well as computational techniques and (2) the inclusion of open-ended questions and their assessment with Aiken's V in order to validate responses. © 2018 Copyright held by the owner/author(s).

    Unilateral Weighted Jaccard Coefficient for NLP

    Similarity measures are essential to solve many pattern recognition problems such as classification, clustering, and retrieval problems. Various similarity measures are categorized in both syntactic and semantic relationships. In this paper we present a novel similarity, the Unilateral Weighted Jaccard Coefficient (uwJaccard), which takes into consideration not only the space between two points but also the semantics between them in a distributional semantic model. The Unilateral Weighted Jaccard Coefficient provides a measure of uncertainty, which makes it possible to measure the uncertainty between sentences such as "man bites dog" and "dog bites man". © 2015 IEEE.

    Design of network infrastructure of a cloud data center for use in health sector

    This article presents the design of the network infrastructure of a data center that meets the requirements arising from cloud computing, for use in the health sector of the city of Arequipa, focusing on network layer 2 and its dimensioning to meet the requirements of several health service applications. Calculating the dimensioning of the network infrastructure is a complex challenge for a project built from the ground up; in this article we present a novel approach to solving this challenge. Copyright © 2015 for the individual papers by the papers' authors.

    Paradigmatic Clustering for NLP

    How can we retrieve meaningful information from a large and sparse graph? Traditional approaches focus on generic clustering techniques and on discovering dense clusters in a network graph; however, they tend to omit interesting patterns such as paradigmatic relations. In this paper, we propose a novel graph clustering technique that models the relations of a node using paradigmatic analysis. We exploit a node's relations to extract its existing sets of signifiers. The newly found clusters represent a different view of a graph, which provides interesting insights into the structure of a sparse network graph. Our proposed algorithm for clustering graphs, PaC (Paradigmatic Clustering), uses paradigmatic analysis supported by an asymmetric similarity. In contrast to traditional graph clustering methods, our algorithm yields worthy results in word-sense disambiguation tasks. In addition, we propose a novel paradigmatic similarity measure. Extensive experiments and empirical analysis are used to evaluate our algorithm on synthetic and real data. © 2015 IEEE.

    Complete cone symmetric temporary NAT

    Network Address Translation (NAT) is a mechanism used by almost every user on the Internet, primarily to alleviate the exhaustion of the IPv4 address space by allowing multiple hosts to share a public Internet address. NAT allows TCP communication to be established if it starts from inside the NAT, but not if it starts from the public Internet, outside the NAT. This is called the NAT traversal problem. It causes communication among peers to rely on a third, intermediary computer for the whole exchange. This is a security issue, because the intermediary can obtain a copy of the communication, and it also slows the communication down, because traffic must pass through the third computer. This is the case for P2P, VoIP and live games, among other Internet applications. In this article we present a novel mechanism for establishing communication among peers that are behind a NAT without using a third intermediary for the whole communication. © 2016 IEEE.

    Clustering algorithm based on asymmetric similarity and paradigmatic features

    Similarity measures are essential to solve many pattern recognition problems such as classification, clustering, and information retrieval. Various similarity measures are categorised in both syntactic and semantic relationships. In this paper, we present a novel similarity, the unilateral Jaccard similarity coefficient (uJaccard), which takes into consideration not only the space between two points but also the semantics between them. How can we retrieve meaningful information from a large and sparse graph? Traditional approaches focus on generic clustering techniques for network graphs. However, they tend to omit interesting patterns such as paradigmatic relations. In this paper, we propose a novel graph clustering technique that models the relations of a node using paradigmatic analysis. Our proposed algorithm for graph clustering, paradigmatic clustering (PaC), uses paradigmatic analysis supported by an asymmetric similarity using uJaccard. Extensive experiments and empirical analysis are used to evaluate our algorithm on synthetic and real data. Copyright © 2016 Inderscience Enterprises Ltd.

    Comparing topics in CS syllabus with topics in CS research

    This study quantifies and compares the computer security themes found in the ACM Computer Science curricula with the themes addressed in top-ranked computer security research conferences over the past six years. On the understanding that current research should help set the agenda for course coverage, we use a strategic diagram to compare the research topics with the curriculum topics and identify specific future directions for the ACM CS curriculum and for computer security courses.

    Unilateral Jaccard similarity coefficient

    Similarity measures are essential to solve many pattern recognition problems such as classification, clustering, and retrieval problems. Various similarity measures are categorized in both syntactic and semantic relationships. In this paper we present a novel similarity, the Unilateral Jaccard Similarity Coefficient (uJaccard), which takes into consideration not only the space between two points but also the semantics between them. Copyright © 2015 for the individual papers by the papers' authors.

    AL-DDoS attack detection optimized with genetic algorithms

    Application Layer DDoS (AL-DDoS) is a major danger for Internet information services, because these attacks are easily performed and implemented by attackers and are difficult to detect and stop using traditional firewalls. They manage to saturate, physically and computationally, the information services offered on the network, directly harming legitimate users. To deal with this type of attack at the network layer, previous approaches propose a configurable statistical model and observe that, when its various configuration parameters are optimized using genetic algorithms, it becomes more effective at detecting Network Layer DDoS (NL-DDoS). However, this method is not enough to stop DDoS at the application level, because this level presents different characteristics. We therefore propose a new method, configurable and optimized for different attack scenarios, that effectively detects AL-DDoS. © Springer Nature Switzerland AG 2018.